Treatment of Multiword Expressions and Compounds in Bulgarian
Abstract
The paper shows that catena representation, together with valence information, can provide a good way of encoding Multiword Expressions (beyond idioms). It also discusses a strategy for mapping noun/verb compounds to their counterpart syntactic phrases. The data on Multiword Expressions comes from BulTreeBank, while the data on compounds comes from a morphological dictionary of Bulgarian.
Similar resources
The Far Reach of Multiword Expressions in Educational Technology
Multiword expressions as they appear as nominal compounds, collocational forms, and idioms are now leveraged in educational technology in assessment and instruction contexts. The talk will focus on how multiword expression identification is used in different kinds of educational applications, including automated essay evaluation, and teacher professional development in curriculum development fo...
Domain-Dependent Identification of Multiword Expressions
The identification of different kinds of multiword expressions requires different solutions; on the other hand, there might be domain-related differences in their frequency and typology. In this paper, we show how our methods developed for identifying noun compounds and light verb constructions can be adapted to different domains and different types of texts. Our results indicate that with littl...
Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality
We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...
Distinguishing Subtypes of Multiword Expressions Using Linguistically-Motivated Statistical Measures
We identify several classes of multiword expressions that each require a different encoding in a (computational) lexicon, as well as a different treatment within a computational system. We examine linguistic properties pertaining to the degree of semantic idiosyncrasy of these classes of expressions. Accordingly, we propose statistical measures to quantify each property, and use the measures to...
Automatic acquisition of “noun+verb” idiomatic compounds in Korean
Song, Sanghoun. 2015. Automatic acquisition of “noun+verb” idiomatic compounds in Korean. Linguistic Research 32(1), 253-280. The state-of-the-art skills of computational linguistics pay attention to lexical semantics, because it has a potential to be used to improve language processing systems in terms of coverage as well as accuracy. In particular, utilizing multiword expressions is important...
Publication date: 2014